Thanks for the leaderboard update, @cocohearts. A couple of FYIs for context:
#1797 → #1855. Just a heads-up: the base PR #1797 has the validity concern you raised in your audit comment, but the downstream PR #1855, which is built on the #1797 stack, is itself valid because that specific concern is fixed there. #1855 applies the `not_bos = (input_ids[:, 1:] != BOS_ID)` mask in both `_forward_hidden` and `forward_ttt`, exactly as your audit recommended. 3-seed mean: 1.06108 BPB (std 0.00090), independently reproduced by @okezue.
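For readers who have not opened the diff, here is a minimal sketch of what a mask of that form selects. It uses NumPy to stay lightweight; the same expression works on PyTorch tensors. How #1855 consumes the mask inside `_forward_hidden` and `forward_ttt` is not shown in this thread, so the sketch assumes it is used to drop positions whose target token is BOS from the averaged loss. `BOS_ID = 1`, `masked_mean_loss`, and the toy arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

BOS_ID = 1  # hypothetical BOS token id; the real value depends on the tokenizer


def masked_mean_loss(per_token_loss, input_ids, bos_id=BOS_ID):
    """Average next-token loss over positions whose target is not BOS.

    per_token_loss has shape (batch, seq_len - 1): one loss per predicted token.
    input_ids has shape (batch, seq_len).
    """
    # Targets are the inputs shifted left by one, so the mask is built
    # from input_ids[:, 1:], matching the expression quoted above.
    not_bos = (input_ids[:, 1:] != bos_id)
    kept = not_bos.sum()
    if kept == 0:
        return 0.0
    return float((per_token_loss * not_bos).sum() / kept)


# Two packed sequences; BOS (id 1) also appears mid-row at document boundaries.
input_ids = np.array([[1, 5, 6, 1, 7],
                      [1, 8, 1, 9, 4]])

# Toy losses: 1.0 everywhere except the two positions that predict BOS.
per_token_loss = np.ones((2, 4))
per_token_loss[0, 2] = 100.0
per_token_loss[1, 1] = 100.0

loss = masked_mean_loss(per_token_loss, input_ids)
```

In this toy setup the two BOS-target positions are excluded, so the large losses placed there do not affect the average.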
#1530. For reference, this PR has its own structural concern open in its thread: the TTT compile warmup runs `backward()` / `step()` on actual validation tokens before the main eval loop, which @dexhunter and @msisovic flagged as structurally matching the pattern called out in #677. @samacqua confirmed the gap is within run-to-run variance and offered a synthetic-token warmup as the fix, but the merged head still appears to use val tokens for the compile warmup. Whether this rises to the same kind of validity blocker that was applied to #1797 is the maintainers' call, but I am flagging it explicitly since the structural pattern (adapt-on-validation-before-the-reported-eval-pass, per #677) looks similar.
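To make the synthetic-token idea concrete, here is a rough sketch of a warmup that never touches validation tokens. This is not @samacqua's proposed patch, which is not reproduced in this thread. Every name (`make_synthetic_batch`, `warmup_without_val_tokens`, `toy_ttt_step`) is hypothetical, the "model" is a plain dict standing in for real weights and optimizer state, and restoring the pre-warmup state afterwards is my own assumption about what a clean warmup would want.

```python
import copy
import random


def make_synthetic_batch(batch_size, seq_len, vocab_size, seed=0):
    """Random token ids that share only the shape of a validation batch."""
    rng = random.Random(seed)
    return [[rng.randrange(vocab_size) for _ in range(seq_len)]
            for _ in range(batch_size)]


def warmup_without_val_tokens(state, ttt_step, batch_size, seq_len, vocab_size):
    """Run one throwaway adaptation step, then restore the pre-warmup state."""
    snapshot = copy.deepcopy(state)
    # In a real setup this call is what would trigger compilation.
    ttt_step(state, make_synthetic_batch(batch_size, seq_len, vocab_size))
    # Discard whatever the warmup step changed (assumed desirable here).
    state.clear()
    state.update(snapshot)


def toy_ttt_step(state, batch):
    """Toy stand-in for a backward()/step() pair: nudges a single weight."""
    state["w"] += 0.01 * sum(map(sum, batch))
    state["steps"] += 1


state = {"w": 0.5, "steps": 0}
warmup_without_val_tokens(state, toy_ttt_step,
                          batch_size=2, seq_len=8, vocab_size=100)
```

The point of the pattern is that the reported eval pass starts from state that has seen no validation tokens, while the warmup step still exercises the same code path.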
README-only leaderboard update adding p<0.25 accepted chain #1518/#1530/#1610/#1626/#1667/#1784; #1787/#1797/#1801 intentionally excluded due to validity/provenance blockers.